The code is the documentation

2018-09-12

The first time I heard someone say “the code is the documentation”, I thought it sounded completely wrong, like a lazy excuse for not producing documentation. However, it stayed with me, and I realised that there is also truth in the statement. This paradoxical, thought-provoking quality makes it a fitting mantra for agile practitioners, because it expresses a fundamental agile value: “working software over comprehensive documentation”.

Before going into detail, I should dismiss the notion that agile developers don’t write documentation or underestimate its worth. We still produce documentation. However, we also apply the following agile principle to it: “Simplicity, the art of maximising the amount of work not done, is essential.” It makes sense to minimise documentation by adhering to practices that reduce the need for it. In agile development, documentation plays a supporting role, whereas the primary attention is given to the codebase. Or more succinctly:

“Truth can only be found in one place: the code.” (Robert C. Martin, Clean Code)


The codebase is the ultimate source of truth. It is consulted in case of doubt, when a question is too detailed for the documentation to answer or when the documentation is out of date. The code provides insight where no other reference is available. With this in mind, it is evident that code should be well-structured and understandable. Self-documenting code reduces or eliminates the need for external documentation. Knowledge is then represented in a single artefact, and there is no need to synchronise multiple sources. Unfortunately, most real-world codebases do not have these ideal qualities. There can be many reasons for that, but the most common is that software entropy has taken its toll over time and too little attention was paid to refactoring.

So, code quality, self-documenting code, and continuous refactoring go hand in hand. It is important to understand that the highest code quality is achieved through conceptual clarity. Conceptual clarity comes from good naming and good structure. These are far more important than coding style, naming conventions, formatting, and other external features, although the latter do of course contribute to code quality. Naming and structure cannot be tested automatically. Unlike code style, conventions, and formatting, they require human perception and intelligence, which is one more reason to adopt practices such as code reviews and pair programming.
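To make the point about naming and structure concrete, here is a minimal sketch in Python. The business rule, the threshold value, and all function names are hypothetical, invented only for illustration. Both versions behave identically, but only the second one tells the reader what it is about.

```python
# Before: correct, but the names carry no meaning. A reader needs a comment
# or an external document to learn what "d", "r", and 100 stand for.
def calc(d, r):
    return d - d * r if d > 100 else d


# After: same behaviour, with the concepts named and the rule isolated.
# The threshold is a hypothetical business rule chosen for this example.
BULK_DISCOUNT_THRESHOLD = 100.0


def qualifies_for_bulk_discount(order_total: float) -> bool:
    """An order qualifies once its total exceeds the threshold."""
    return order_total > BULK_DISCOUNT_THRESHOLD


def discounted_total(order_total: float, discount_rate: float) -> float:
    """Return the amount payable after any bulk discount is applied."""
    if not qualifies_for_bulk_discount(order_total):
        return order_total
    return order_total - order_total * discount_rate
```

A linter or formatter would accept both versions equally. Whether “discounted_total” is the right name for the concept is something only a human reviewer can judge.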

Coming back to “the code is the documentation”, I think the best way to understand this phrase is as an abstract ideal to work towards. In an ideal world, code doesn’t need additional documentation, because it is so clear that anyone can understand it without prior knowledge. It answers all questions that might arise about the software. It makes the programmer’s intentions plain and is therefore also easy to maintain and change. Obviously, this is very difficult to achieve in the real world, especially across a large codebase, but a codebase’s self-documenting properties are quite likely the best measure of its quality.